WHAT NEXT IN THE PSYCHOLOGY OF MUSICAL MEASUREMENT?
ROBERT W. LUNDIN 


Hamilton College 


In considering “What next in the psychology of music?” I should like to
mention some problems involved in the measurement of musical talent or ability
and in the prediction of success in music. Although there are many tests on the
market, some old, some new, some good, some bad, three batteries stand out: first,
because they are commercially accessible and, second, because considerable
research has been reported involving their use. It is entirely possible that better
tests exist or are yet to be constructed, but at this point we have little or no access
to them, so we shall have to hold in abeyance our appraisal of these other
measures until some future time.


Let us consider the three most widely used measures of musical aptitude — 
the Seashore Measures of Musical Talents (17), the Kwalwasser-Dykema Music 
Tests (6), and the Drake Tests of Musical Aptitude (2) — in order of their historical 
development and suggest what might be done next with them to give a better 
understanding of what behaviors they are measuring and a more valid basis for 
the prediction of future success in music. In my opinion, all of these tests have 
some value, although I realize that this attitude might not be shared by all others
interested in this subject. 


A. THE SEASHORE MEASURES OF MUSICAL TALENTS 


During the past four decades, extensive research has been carried out with 
these measures. Much of it has been done with the earlier 1919 version (16). The 
revised 1939 version (17) appears to be an improvement, at least from the 
standpoint of reliability. Although the 1956 revised manual published by The 
Psychological Corporation (18) reports no test intercorrelations, the evidence has 
strongly supported the notion that the tests are measuring rather specific sensory 
capacities. A recent study by McLeish (10) sheds some light on this problem. The 
object of his study was to validate the Seashore tests by factorial procedure on
the grounds that conventional validation methods which used external criteria 
were logically unsatisfactory. A criterion admittedly imperfect, such as ratings, 
is usually employed to demonstrate the worth of a test which is to replace it. 
The intercorrelations of the Seashore tests were factorized by Burt’s method of 
simple summation. They reveal a general factor accounting for 29% of the 
variance, a bipolar factor accounting for 10%, specific factors accounting for 
38% and error factors accounting for 23% of the variance. McLeish identifies 
the general factor as being closely related to musical ability and appreciation. 
The bipolar factor is evidently a classification factor. It appears to subdivide the 
six tests according to their concrete content, contrasting (a) those that depend 
chiefly upon immediate discrimination with (b) those that depend upon immediate 
memory. The specific factors for Pitch, Loudness, Time, and Rhythm are decidedly 
large. The saturations of these factors may be taken, McLeish believes, as indicating
to some degree the atomistic nature of the battery, that is, the extent to
which it depends upon a number of specific, mutually independent abilities. He
suggests that their use will be most effective if the scores are weighted in 
accordance with calculated regression coefficients. 
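
The variance breakdown McLeish reports can be checked with simple arithmetic. The sketch below only restates the four percentages given above; the variable names and the derived groupings (common variance, non-error variance) are illustrative assumptions, not quantities McLeish is reported to have tabulated.

```python
# Percentages of total variance as reported for the Seashore battery above.
# Names are illustrative; only the four numbers come from the text.
variance_pct = {
    "general": 29,   # general factor
    "bipolar": 10,   # bipolar (discrimination vs. memory) factor
    "specific": 38,  # specific factors
    "error": 23,     # error factors
}

total = sum(variance_pct.values())
# Variance shared across tests (general + bipolar factors).
common = variance_pct["general"] + variance_pct["bipolar"]
# Non-error variance (everything except the error factors).
reliable = total - variance_pct["error"]

print(f"total = {total}%, common = {common}%, reliable = {reliable}%")
```

On these figures the specific factors (38%) account for about as much variance as the two common factors together (39%), which is the sense in which the battery may be called atomistic.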


If this is the case, then the test should be used where rather specific abilities 
are indicated. Seashore calls our attention to the fact that a poor sense of pitch 
discrimination disqualifies the player for stringed instruments, while a poor sense 
of loudness is a more serious handicap for the pianist. Many music educators 
have accepted these statements for years. But how many specific validation 
studies have been conducted along these lines? It might be worthwhile to select 
stringed instrument players with various degrees of rated ability, or pianists, to 
see just how important the relationships are between the performance and test 
results. Seashore assumed that the better the discrimination, the better the
musicianship. How true is this? Perhaps merely a minimum cut-off point is needed.
A moderate degree of pitch discrimination may be necessary and, beyond that 
point, other variables contribute to success. Specifically, what roles do pitch, 
loudness, etc., play in performance in various instruments? 


Too many of our validation studies carried out with this test, as well as with 
other musical measures, have used a very general external criterion of “talent” or
“over-all ability.” Often, the correlations have been depressingly low. Let us
find the specific performances where these abilities are most needed before we 
discard the Seashore tests as being useless measures of musical talent. 


Too often we assume relationships existing between certain variables 
because they appear so on the basis of their face value. Let me illustrate. In a 
review of the literature on absolute pitch, Neu (14) concluded that absolute 
pitch was nothing more than a highly developed degree of pitch discrimination. 
On the basis of the evidence he reported, this assumption appeared reasonable. 
However, in a recent study by Oakes (15) two tests were constructed, one for 
pitch naming, intended to measure the so-called absolute pitch, and the other a 
test of pitch discrimination constructed along the lines of the similar one in the 
Seashore battery. These tests were given to one group of music students who 
were reported by their teachers as possessing absolute pitch, and to a second 
group of music students who themselves claimed to have absolute pitch. Other 
groups consisted of students with lesser and no amounts of musical training and 
experience. Oakes found for all subjects the relationship between pitch naming 
and pitch discrimination to be about .41. When the so-called absolute pitch 
students were considered alone, the relationships were far from perfect. Many 
of the best students on the pitch naming test did not rank high on pitch
discrimination. Although we have tests of specific abilities, we have not gone far
enough in finding out the kinds of musical performance where these abilities are 
most necessary. 
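
One way to read a correlation of this size, offered here as an illustration rather than as part of Oakes's analysis, is to square it: the square of a correlation coefficient gives the proportion of variance in one measure that is linearly associated with the other.

```python
# Correlation between pitch naming and pitch discrimination reported above.
r = 0.41

# Proportion of shared variance (coefficient of determination).
shared = r ** 2

print(f"r = {r:.2f}, shared variance = {shared:.1%}")
```

A correlation of .41 thus corresponds to roughly 17 percent shared variance, leaving most of the variation in pitch naming unaccounted for by pitch discrimination.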


We all recall Stanton’s extensive study (19) using the Seashore tests at the 
Eastman School of Music in the 1930's. The obvious weakness of this analysis 
was found to be that the contribution of the Seashore tests, intelligence, case 
history, and other variables was not made clear. A recent study by Wilson (20) 
found intelligence to be useless in predicting success in specific music theory 
courses such as Dictation, Harmony and Sight-reading; but the Seashore tests of 
Loudness, Tonal Memory and Timbre had some predictive value in the Dictation 
and Sight-reading courses. 

Would not a repetition of the Stanton-type study using, first of all, the 
revised version of the Seashore tests, perhaps the Drake tests, as well as some 
other measures, at a school of the caliber of Eastman be worthwhile? This time 
let the results be presented in a manner so that the specific contributions of the
music tests, intelligence, case history, etc., can be indicated. 


B. KWALWASSER-DYKEMA MUSIC TESTS 


As a means of evaluating the K-D tests, the manual (6) is extremely unsatisfactory.
No mention is made of reliability or validity. However, we are not
without information on this matter. Summaries of studies on these tests are 
reported by Beinstock (1), Farnsworth (3) and Lundin (9). These have indicated 
the reliability for the individual tests to be so low as to render them practically 
useless. Validities are no better; in many cases, even worse. Still, the tests are 
widely used. 


The publication of Kwalwasser’s recent book (7), Exploring the Musical 
Mind, brings to us a plethora of studies heretofore unknown. These are the result 
of about two decades of work done by Kwalwasser and his students in the form 
of unpublished masters’ and doctors’ theses. Of particular interest are Kwal- 
wasser’s reports relating the performance on his tests, along with other measures 
of musicianship, and measures of physical characteristics, as well as tests of 
equilibrium, steadiness, tapping speed, finger dexterity, tongue ability, etc. 


Most of the studies have used the entire battery of the Kwalwasser-Dykema 
tests. When this is done, the reliability is somewhat improved so as to make the 
tests more useful. Kwalwasser reports that music students are superior to non- 
musicians in tests of motor precision, mechanical aptitude, and finger dexterity, 
but are physically lighter, shorter and less athletic. Of course, in most cases the 
criterion of a musician is a good performance on his tests. 


These investigations suggest some possibilities for further study along the 
line that I mentioned earlier. What specific relationship, if any, exists between a
pianist’s performances and tests of finger dexterity, rate of manipulation, etc.? 
How do violinists fare on these performances? What kind of motor precision do 
clarinet and trumpet players need? 


Kwalwasser and his students have investigated the relation between musical
talent and tongue agility (the number of times per second a person can say
“du, du, du”). He reports that musicians do better, but he mentions that norms
relating this ability to performance on specific instruments do not exist.


In spite of the fact that investigators have found the reliabilities of the K-D 
tests below minimum standards, all hope is not lost. The K-D tests have a strong 
appeal because they are interesting to take and appear to be more lively and 
musical than the Seashore tests. A recent report by Holmes (5) suggests a way of 
increasing the reliability of the K-D tests by using a new set of scoring keys and 
instructions for the original phonograph records. For example, in the original
test of tonal memory, the subject is directed to
report the musical phrases as same or different. In the new directions a greater
range of response is called for. The subject is asked to report E if the two phrases
are equal, DL if the change in the second phrase is lower, DH if the change is
higher, and D if it is different but the subject cannot tell the direction of the change.
In scoring, if DL is correct, two points are given. Other D responses get one point 
and E gets no points. The new directions reduce the element of chance and take 
advantage of possibilities for finer discrimination. Other tests are altered in 
directions and scoring in a similar fashion. As a result of these changes, Holmes 
reports improved reliabilities as follows: Tonal Memory, .73; Time, .50; Tonal 
Movement, .88; Rhythm, .71; Quality Discrimination, .70; Intensity, .79; Pitch, .72; 
Melodic Taste, .43. Using these eight tests together, a total reliability of .91 is
reported. Although these reliabilities leave much to be desired, they are much 
higher than those previously reported (1, 3, 9) and compare more favorably 
with the Seashore tests than before Holmes’ improvement. 
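
The revised scoring rule for tonal memory can be sketched as a small function. This is only an illustration of the rule as summarized above, for items whose second phrase is changed. The function name, the handling of invalid input, and the literal reading of "other D responses" (any D-type answer other than the keyed one) are assumptions. The text does not say how items with identical phrases are scored, so the sketch declines to score them.

```python
# Response codes described in the text:
#   E  = phrases equal, DL = changed lower, DH = changed higher,
#   D  = different, direction unknown.
RESPONSES = {"E", "DL", "DH", "D"}


def score_changed_item(response: str, key: str) -> int:
    """Score one item whose second phrase differs (key is 'DL' or 'DH').

    Hypothetical helper: 2 points for the keyed direction, 1 point for
    any other D-type response, 0 points for 'E'.
    """
    if key not in {"DL", "DH"}:
        # Scoring for 'equal' items is not specified in the source.
        raise ValueError("only items keyed 'DL' or 'DH' are covered")
    if response not in RESPONSES:
        raise ValueError(f"unknown response: {response!r}")
    if response == key:
        return 2
    if response == "E":
        return 0
    return 1  # 'D' or the opposite direction


# Example: four subjects answering an item keyed 'DL'.
scores = [score_changed_item(r, "DL") for r in ["DL", "D", "DH", "E"]]
print(scores)
```

Compared with a same/different response, where a blind guess has an even chance of being correct, the graded key gives partial credit for detecting a change and full credit only for identifying its direction.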


Perhaps we are now in a position for some validation studies which will be 
more meaningful. Using this revised technique, better norms may be reported and 
perhaps we will find the tests to discriminate better at the higher levels than 
they have done previously. 


C. TESTS OF MUSICAL TASTE 


Of the tests so far discussed, most measure discriminative and memory 
abilities. Of course, two of the Kwalwasser tests measure what in a general sense 
we may refer to as musical taste. This is an area of aesthetic judgment where 
much work still needs to be done. It is a difficult behavior to measure objectively 
because of the lack of absolute criteria for scoring. Farnsworth (4), and Hevner 
and Mueller (12, 13) have demonstrated that although musical taste is a stable 
phenomenon in our society, it is subject to change as the years go by and shows 
deviation in different geographical areas of our country. 


Many of the tests in this area have not fared very well. We have already 
noted the poor reliability of the Kwalwasser test, even in the Holmes revision. 
The Seashore test of Consonance in the original battery was replaced by the new 
Timbre test in the revised version because of lack of objective criteria in scoring 
as well as poor reliability. 


In my opinion, the best and most reliable test of this type is the Oregon 
Musical Discrimination Test of Hevner and Landsbury (11). Unfortunately, this 
is no longer available commercially, although there still are many requests for it. 
The test consists of 48 pairs of short musical selections reproduced on phonograph 
records. One member of each pair is the original version, by a generally acclaimed
composer. The other is a distortion in which rhythm, melody or harmony has been spoiled.
These tests were administered to a group of musical experts, including performers,
composers and critics. No item was retained unless the original version
was unanimously chosen as the preferred one by the experts. Although the 
Oregon tests proved too difficult for some elementary school children, when 
college students were used a reliability of .88 was reported. It seems a pity that 
more use of this test has not been made in aptitude studies. It appears to measure 
an aspect of musical behavior quite objectively, which is not included in the 
Seashore or Drake test batteries. I would like to see a revision of this test, making
it more applicable at lower age levels so that it could again be made available 
for further research. 


D. THE DRAKE TESTS OF MUSICAL APTITUDE 


The most recent addition to the field of musical aptitude testing has been the 
Drake tests of Rhythm and Musical Memory, published in 1954. The manual (2), 
published by Science Research Associates, reports good reliabilities for both 
tests. It seems that the publication of these tests is a valuable and much needed 
addition to this field and makes available a new instrument for further research. 


Drake reports remarkably good validities using an external criterion of 
“talent,” which is defined in terms of expression in playing and rapidity in
learning music. Validities reported range from .31 to .91, with a majority running
around .58. Since these validity coefficients involve a rather general criterion,
they turn out remarkably well. How might they compare if more specific ratings 
were used for various kinds of musical performance, as well as work in various 
music theory courses? 


Another interesting problem involves the Drake Rhythm test. This test is 
unlike that in the Seashore or Kwalwasser-Dykema series. It involves the ability 
to “keep rhythm” and maintain a particular tempo despite distraction. It is 
really a test of tempo. Drake reports the correlations between his test and the 
Seashore test of rhythm to be very low, ranging from .02 to .11. It would appear 
that two different variables are being measured. Further information concerning 
the relationship between the ability to keep rhythm and rhythmic discrimination 
might be worth having. 


Drake does not report comparable correlations between his test of musical 
memory and the similar test in the Seashore battery. However, Lundin (8) found 
a correlation between the two tests of .56, using the earlier edition of the Drake 
memory test. Here again, further investigation seems worthwhile, particularly if 
the performances of musicians and non-musicians were compared. 


How are the abilities measured by these two tests subject to improvement 
by training? Drake mentions that musical memory can be improved but gives no 
studies to support his statement. We are all familiar with the studies of Wyatt in 
1945 (21) on the improvability of pitch discrimination. She reported marked 
improvement with the right kind of training even when the poorest subjects 
were used. The author would be interested in locating some of the real failures 
as measured by both Drake tests, to subject them to specific training to find 
out at what age levels and in which ability training is most helpful. Drake reports 
norms for children as young as seven years on the musical memory test. 


The intercorrelations between these two tests are reported to be quite low. 
Yet, both correlate well with the general criterion used for validation. Drake 
suggests that these two factors, along with pitch discrimination, make up
musical aptitude. Would not a factorial analysis be useful, using these two tests
and one of pitch discrimination along with others? The author's impression is that
the musical memory test might include a number of different characteristics. 


In the manual Drake says, “It is believed that not all musicians need the 
same degree of ability in both factors. The drummer, for instance, is probably 
dominant in rhythm and the performing artist probably dominant in musical 
memory.” It would be worthwhile finding out if these “probabilities” are really 
true and how much each test contributes to the prediction of success in various 
fields of musical performance. 


To my knowledge, the vast majority of research done with these tests has 
been performed by Drake himself. Now that they are readily available com- 
mercially, other investigators might find them useful in research. 


Finally, I should like to know something about the relationship between
musical memory as measured by our existing tests and other kinds of memory 
behavior, such as memory for designs, digit memory, associative memory and 
so forth. How is musical memory related to the general M factor reported by 
Thurstone, for example? Investigators have already told us that musical aptitude 
correlates only slightly with general intelligence. But when we examine the 
matter more closely, might some relationship exist? 


Although we have gone a long way since the days when the Seashore tests
first appeared on the market, it seems evident that there is still need for study
before we can find ourselves in the same position as those who have investigated
some other aptitudes, such as intelligence. Perhaps the best test is still to be
discovered, but in the meantime we should be able to predict more effectively
with the measures already available to us.


RORSCHACH TEST BEHAVIOR AND RELATED VARIABLES


A. JACK HAFNER 
Washington University School of Medicine 


Traditionally an individual’s behavior on the Rorschach Ink Blot Test has
been conceptualized as a function of the individual’s “basic” personality structure
and, as such, as relatively free of external influence. An individual's
Rorschach constellation was considered to be essentially invariant, aside from 
basic changes in the personality structure itself. Later, this became known as 
the doctrine of immutability and gave the Rorschach test a unique status in the 
area of psychological testing. This conception of Rorschach behavior has been 
held by such leading proponents of the method as Beck (2), Klopfer (20), Hertz 
(15), Halpern (13), and Piotrowski (25). For example, Klopfer (20, p. 26) states: 
“The Rorschach method does not reveal a behavior picture, but rather shows — 
like an X-ray picture — the underlying structure which makes behavior under- 
standable.” 


Following the lead of the experts, the clinical psychologist who utilizes the 
Rorschach in his work interprets Rorschach behavior in light of the above 
stated considerations. This is exemplified in the statement by Kurtz and Riggs 
(21, p. 465): 

In using a projective technique, the clinician usually 
starts from the premise that the formally scored variables 
reflect relatively central or permanent aspects of the per- 
sonality. He does not expect momentary situational factors or 
the subject's casual expectations about the task to distort or 
even seriously color, these scored variables. 


The viewpoint that Rorschach behavior is relatively free of external 
influences and reflects primarily the individual’s basic personality structure has 
met with increased criticism in recent years and has been undergoing some 
rather intensive experimental investigation. In general, the results of these
investigations make the assumption of the doctrine of immutability rather untenable.
As early as 1934 Bleuler (4) pointed out from his extensive clinical
experience that a number of external factors could influence a person's set, 
which in turn would alter his behavior on the Rorschach decisively. Some of the 
factors he emphasized were the influence of hospitalization on the occurrence 
of anatomy responses, the familiarity of the particular clinical setting, and the 
personal relationship between the individual and the test administrator. A review 
of the literature since Bleuler’s publication indicates that most of the experimental 
investigations of external influences on Rorschach behavior have been carried 
on since 1945, with the majority of the work being done in the last six years. 


One of the early attempts to investigate the influence of instructional set 
was Fosberg’s study (11) of the susceptibility of the Rorschach to falsification. 
One group of Ss was given instruction to make the best impression, and another 
group to make the worst impression. Fosberg concluded that the “permanent 
underlying personality” emerged and that the Rorschach could not be faked. 
However, Cronbach (8) has pointed out that the statistic used by Fosberg was 
likely to give spuriously high correlations and at the same time mask real 
differences. A similar study was conducted by Carp and Shavzin (6) and,
although no group differences resulted, there were wide individual differences, 
with some Ss able to alter their responses significantly. In the most recent study 
of this kind, Feldman and Graley (10) obtained records that indicated maladjust- 
ment when they gave Ss instructions to simulate abnormality on the Rorschach. 


Hutt, et al. (16) investigated specific experimental sets and told one group 
of Ss to pay attention to segmental areas, another group to find movement 
responses and a third group to give only good form responses and responses 
combining form, color, and human movement. On a test-retest basis sig- 
nificant results were obtained. In another study Coffin (7) obtained the S's 
evaluation of occupations. Following this, he gave the Ss an article that ap- 
peared to be a reprint from a journal which suggested the ways in which 
members of certain occupations responded on the Rorschach. Next, the Ss were 
administered the Rorschach. Coffin found an influence on the Rorschach 
responses, with Ss responding the way that members of their particular preferred 
occupations were supposed to respond. A similar study was done by Abramson (1) 
using test-retest, with one group being told, following the first administration, 
that successful business and professional people responded to the Rorschach 
with responses utilizing a whole blot and that the approach to the test correlated 
highly with intelligence. Another group was given the same information, except 
that large detail responses were emphasized instead of whole responses. 
Abramson found that the response areas changed significantly upon retest in 
the direction of the two sets. 


In another group of investigations dealing with the influence of previous 
situations on Rorschach behavior, Rabin, Nelson, and Clark (26) used three 
groups of Ss, with one group waiting in a room with sexually stimulating pictures 
on the wall, another group waiting in a room with anatomical pictures and 
a control group waiting in a plain room before the Rorschach was administered. 
In comparison to the control group, a significant increase in the number of 
sex responses was found for the group that waited in the room with sexually 
stimulating pictures, but no significant differences were found in the number of 
anatomy responses for the group that waited in the room with the anatomy 
pictures. The sex of the examiner was also found to significantly affect the 
protocols. In an experiment by Lord (22) Ss were given the Rorschach following 
a negative affect loading situation, a positive affect loading situation and a neutral
situation, with the sequence varied for the different Ss. She found that the
Rorschach behavior varied significantly as a function of test repetition, positive 
and negative rapport conditions of administration, and examiner differences. 


There have been other Rorschach studies investigating the influence of 
examiner differences on Rorschach behavior, such as that by Sanders and 
Cleveland (27) in which the Rorschach was given to the examiners and evaluated 
for hostility and anxiety indices. This was checked against the records of Ss tested 
by the examiners. Significant differences were found in the S’s records relating 
to the examiner differences in hostility and anxiety. Gibby, Miller, and Walker 
(12) found significant differences in certain Rorschach determinants elicited by 
different examiners with matched groups. 


In a study encompassing a number of the above factors, Luchins (23) tested 
a group of service personnel and found a great variance from Rorschach norms. 
After that, he gathered information about the Ss tested and conducted interviews 
with them and later gave them a retest on the Rorschach. From his results he 
concluded that the Rorschach test responses were influenced by attitudes towards 
the test and the examiner, previous experience, and educational, occupational
and cultural background. 


Some investigations have been carried out concerning the influence of the 
amount of time spent in responding to the Rorschach cards. Weisskopf (33) 
did a preliminary study in which she gave 10 Ss the Rorschach, but allowed 
them to spend only 10 seconds on a card. She retested the Ss later without 
time limitations and found suggested differences between the two types of test 
administration, although no statistical tests were carried out. Siipola and Taylor 
(31), using 20 simplified ink blots derived from the regular Rorschach cards, 
administered them to a group of Ss who were instructed to give one response 
to each card as fast as they could. They also displayed a timer in front of the 
Ss and told them they would be timed on each card. In comparison with a 
control group that had no time limits, the experimental group showed a sig- 
nificant increase in the “primitive, non-constructive types of reactions” to ink 
blots. In other research on time factors Stein (32) used a tachistoscopic adminis- 
tration of the Rorschach with four different exposure times, varying from .01 
seconds to full exposure. Each S was given all the exposures, with part of the Ss 
starting with the short exposure time and the others starting with full exposure. 
Stein found significant changes in the Rorschach protocols with an increase in 
exposure time. 


One of the first experimental investigations of situational influences in 
regard to Rorschach behavior was conducted by Kimble (18). He used the test- 
retest method and administered the Rorschach to his Ss in two different settings. 
One Rorschach was given under standard test conditions and the other was 
given in the student union in a social situation with at least two other people 
present. Kimble found a larger use of color on the Rorschach given in the social 
setting than under standard test conditions. 


The final group of studies in the area of situational factors is concerned 
with the influence of stressful situations on Rorschach behavior. Eichler (9) investi- 
gated experimental stress consisting of a threat of shock and Rorschach indices 
of anxiety. The Rorschach was administered to one group in the stress situation 
and to a control group under regular conditions. He found the anxiety indices 
to be significantly greater for the stress group than for the control group. 
Berger (3) administered the Rorschach under the real life stress of TB patients 
on admission to a hospital and found significant differences when retesting the 
patients six weeks later in regard to anxiety indices. Klatskin (19) also adminis- 
tered the Rorschach under real life stress, in this case to hospital patients before 
major surgery, and found significant differences in anxiety indices between that 
group and a control group in regard to the Rorschach. In an experimental stress 
situation Calden and Cohen (5) administered the group Rorschach to high school 
seniors in an “ego-involvement” situation, in which the Ss were told that the test 
would be used in helping them select a job or in advising them about college. 
The Rorschach was also given to another group with “low ego-involvement” 
instructions. These two main groups were each subdivided into three groups, 
with the Rorschach being defined to one subgroup as measuring intelligence, to 
another subgroup as measuring imagination, and to the other subgroup as 
measuring “nervousness.” The protocols for the “intelligence” groups gave the 
“intellectually controlled stereotyped behavior found in I.Q. testing,” the
“imagination” groups gave significantly more fantasy responses, but no clear-cut 
pattern emerged for the “nervousness” groups. Significant differences were also 
found between the “ego-involvement” and “low ego-involvement” groups. Henry 
and Rotter (14) also investigated situational stress in a study in which one group
of Ss was told, before the Rorschach was administered, that the test was used to
discover serious emotional disturbances. The resulting Rorschach protocols for 
this group were significantly more representative of a cautious conforming 
approach than were those of a comparative control group. In another study by 
Schwartz and Kates (30), between test and retest experimental Ss were given 
a personality evaluation, supposedly based on the first test and indicating that 
they were poorly adjusted. In comparison with a control group, the experimental 
Ss’ retest Rorschach protocols were indicative of more behavioral constriction. 


From the review of these various investigations, the conclusion which emerges 
is that Rorschach behavior is not a function of one factor alone, but of a number 
of different factors. This interpretation of Rorschach behavior has been empha- 
sized in some more recent writings by authors who have attempted to make a 
more comprehensive analysis of the test situation and resulting test behavior. 


Schachtel (29) selected four factors which he considered the most important 
common elements of the Rorschach testing situation. They are: (1) the interrelationship
of the examiner and S in the relationship of the test situation,
(2) the fact that a task is given to the S by the examiner, (3) the usual awareness 
of the S that the examiner or someone else will draw certain conclusions con- 
cerning the S from the way he handles the task, and (4) the specific qualities of 
the task. Sarason (28) analyzed what he considered the determinants of behavior 
in a testing situation and concluded that there were six main factors: (1) the 
nature of the stimulus material, (2) the nature of the instructions, (3) the purpose 
of the testing, (4) the time and place of testing, (5) the test administrator, and 
(6) the attitudinal factors of the S related to previous conditions of learning. 
In another analysis of the testing situation, D. R. Miller (24) arrived at five 
categories which he considered important variables in the situation: (1) setting 
or the characteristics of the situation in which the test is taken, (2) task or 
nature of the test, how it is introduced and the responses required from the S, 
(3) the examiner's social stimulus value and character structure, (4) the S’s 
character structure, and (5) the relationship between the examiner and the S 
in terms of the configurations of interactions. He also emphasized the relative 
interdependence of all five categories and stated that the task and setting 
categories are not easily separated. 


Although there are a varying number of factors given by the above 
writers and different word descriptions used, there is general agreement on many 
of the factors considered to have an influence on Rorschach behavior. The main 
factors pointed out by them, as well as the research reviewed, suggest that 
Rorschach behavior is like other kinds of psychological behavior. If this is the 
case, principles which apply to any analysis of psychological behavior should 
also apply to Rorschach behavior. This can be seen more clearly by taking a 
schema for psychological behavior and comparing Rorschach behavior to that 
schema. Kantor’s (17) analysis of a psychological act centers around interbe- 
havior between an organism and a stimulus object. This interbehavior is a 
function of a number of factors, all of which are of equal importance in under- 
standing a psychological act. The basic unit of the act is the behavior segment, 
and some of the main constructs in its analysis are the response and response 
function, and the stimulus and stimulus function, with the psychological inter- 
behavior centering about the mutual interaction of stimulus and response func- 
tions. The response or response patterns are composed of reaction systems. 
The stimulus and response functions are built up during the interbehavioral 
history, or contacts of the organism and stimulus object. The interbehavioral 
history is continuous with preceding and succeeding behavior segments. In 
addition, the interbehavior in the behavior segment takes place under specific 
setting and mediating circumstances. If Rorschach behavior is analyzed ac- 
cording to this system in which psychological behavior is regarded as dependent 
on the functional relationships of the components of a situation, it is 
understandable that examiners or observers (auxiliary stimuli), situational stress (setting 
factors), previous experience and preparatory set (preceding behavior seg- 
ments and interbehavioral history), attitudes towards the test (response functions), 
as well as the Rorschach card characteristics (stimulus and stimulus functions) and 
personality characteristics (reaction systems) will influence Rorschach behavior. 
An analysis of this type makes evident the untenability of the concept that Ror- 
schach behavior is a function of the “basic” personality alone. 


In spite of the number of investigations which point to the contrary, the 
Rorschach test continues to be interpreted by the clinician in his everyday work 
as though it reflected the individual’s personality structure alone. Very little 
attempt has been made at trying to standardize administration procedures 
themselves, which would seem to be an important starting point. Also, very little 
has been done in the way of systematically investigating, as part of the testing 
routine, possible situational influence. This might be done through a 
standardized interview following the testing itself or by questionnaire. Whatever 
operations of this sort are carried out, the need seems obvious for evaluating 
Rorschach behavior on the basis of the behavioral situation as a whole and 
employing some systematic interpretation of psychological behavior in this analysis. Only 
through a program of this sort does it seem that the ultimate utility of the 
Rorschach technique will be realized, both in contributions to personality theory 
and to clinical practice. 
EVIDENCE FROM RETROGRADE AMNESIA FOR A UNIT OF 
BEHAVIOR HIGHER THAN THE STIMULUS-RESPONSE 


JOHN BUCKLEW 


Lawrence College 


A survey of the literature of retrograde amnesia (R. A.) following head 
trauma indicates that its length can only be very imperfectly accounted for by 
the severity of the trauma, the extent or location of the injury, the duration of 
the period of anterograde amnesia, or other such variables (9, 10, 11, 12). 
Neither is it clear why some individuals will suffer an R.A. following a head 
injury, while others with similar injury will not. Much of this may be due to 
individual differences, but the writer feels that one reason for the lack of relation 
between injury and the length of the R.A. period lies in the close connection of 
the amnesia to the behavioral events immediately preceding the injury. Generally 
speaking, this relation seems to be as follows: if the trauma interrupts a distinct 
goal-directed activity, the R.A. will extend back to a point just after the initiation 
of this goal direction. We will refer to such a unit of action as a motivational 
behavior segment. 


A case cited by Conklin (3) will serve as illustration. A college student leaves 
his laboratory class in psychology in the afternoon to participate in football 
practice. During scrimmage he is hit on the head and rendered semi-conscious, 
although continuing to play. After recovery from the blow he discovers that he 
remembers nothing after leaving the psychology laboratory that afternoon. It 
can be seen that the R.A. covers a period of time in which a new goal direction 
had been operating — that of leaving the classroom in order to participate in 
football practice. In another case known by the writer, a girl suffered an R.A. 
following a head injury sustained in a fall from her riding horse one Sunday 
morning. She remembers getting up in the morning, dressing in riding clothes, 
eating Sunday breakfast, and leaving for the riding stable, but nothing there- 
after. This is quite similar to the case from Conklin. 


The following case, taken from the writer's own files, better illustrates the 
principle because the actual point of onset of the R.A. was determined as exactly 
as possible through interview. The events are narrated in chronological sequence 
in order to make clear the origin and duration of the amnesic period preceding 
injury. 
Case of William A., age 17 years 10 months. — Bill was working in a grain 
elevator in North Dakota during the summer vacation of 1949. While there, he 
went on a picnic with a party of young people to Killarney, Manitoba, in Canada, 
about 60 miles away. There were several automobiles in the party, one driven 
by Bill containing three passengers. The picnic took place about noon, and 
around six in the evening the party prepared to leave with the intention of 
returning to a town near home to go roller-skating. 


Bill remembers loading the car with picnic equipment and hollering to a 
girl to hurry up as his was the last car to leave. He remembers getting into the 
car, which was facing the lake, starting the motor and taking a last look at the 
lake, and opening the car door to steer past a hillock in back. At this point the 
R.A. begins. He has no memory of shifting into forward gear or of leaving the 
picnic area. 


His next clear memory, about four and one-half hours later, is that of being 
wheeled down the corridor of a hospital on a stretcher. The intervening events 
were reconstructed piecemeal from the accounts of others. He was some distance 
behind the others and evidently, with youthful exuberance, decided to overtake 
and pass them. He passed one car in the group and had swerved into the op- 
posite lane of the dusty road to pass another when he crashed into a car coming 
the other way containing a man, his wife, and their daughter. Fortunately, no one 
in either car was killed. This ends the period of the R.A., which covers 10 to 
15 minutes. 

After recovery of consciousness in the hospital about 10:30 that evening, 
Bill remembers asking the nurse what happened and, after her reassuring answer, 
slowly recalling that it was he who had been driving the automobile. He had 
sustained a severe cut on the right frontal region of the head, for which an 
operation was performed, and minor cuts and contusions elsewhere. Hospitaliza- 
tion lasted for about a month. 


Bill thinks it was one or two days later, while being questioned by the 
mounted police, that he became fully aware that he had no memory of events 
preceding the accident. Later, while talking to his father, he realized that he 
remembered nothing after backing the car out of the picnic grounds. Up to the 
time of interview, four years later, the R.A. had not disappeared or been reduced 
in length. 

In their 1946 paper on traumatic amnesia (13), Russell and Nathan have 
given four accounts of R.A. which cover a period of 24 hours or more. Such 
lengthy R.A. periods are very rare and, as Russell has shown (12), are more apt 
to follow a very severe injury, as judged by the duration of the post-traumatic 
period of unconsciousness. These cases, briefly summarized here, constitute a 
special test of the principle. Three of them seem to fit, in the special sense that 
the R.A. dates back almost to the beginning of an important change in activity. 
However, the time span seems too long to be considered as a unitary segment of 
activity. Two of the accounts concern flight officers of the English air force during 
the Second World War, who crashed during regular flight maneuvers with R.A.’s 
of two days and eight days respectively. In the first one, the two days cover the 
period during which the officer had been at the post where the crash occurred. 
The last thing he remembered was meeting a fellow officer en route. The other 
R.A. also covers the period during which the officer had been with a new unit. 
He last remembered arriving and noticing his room. Outside of two vague 
“memory islands” during the week, everything subsequent to the arrival was 
lost. A third case is that of a lieutenant thrown from a motorcycle who suffered 
an R.A. dating back a week, at which time his unit was taking up positions along 
the coast in anticipation of an invasion from across the channel. The fourth case, 
that of a corporal thrown from his horse with complete amnesia for the preceding 
24 hours, must be considered a negative instance. He remembered reporting for 
sick call the preceding morning, but hardly anything thereafter. Nothing in the 
account signifies this event as the beginning of a new goal direction which was 
interrupted by the accident. 


Other cases like those mentioned can be found in the literature of the 
subject (2,4,6,12), even though the majority of them are indeterminate due to 
the fact that the R.A. is reported only in terms of time without mention of the 
circumstances of the origin point. In addition, patients receiving shock therapy 
in mental hospitals seem often to show the same type of R.A. (1,5,7,14). The 
nature of R.A. reveals two prominent characteristics of memory: (1) it possesses a 
unity beyond specific responses of remembering particular events, and (2) this 
unity of functioning is closely related to the motivational systems of the indi- 
vidual. That memory is functionally related to other parts of personality is, of 
course, no new idea in psychology. R.A. is akin in some ways to the amnesias of 
fugue states and dissociated personality. It seems to be the opposite of the 
Zeigarnik effect, where the interruption of highly motivated tasks results in a 
heightening of memory. The association of traumatic R.A. with motivation 
emphasizes certain inadequacies in current theory. Chief of these is the fruitless- 
ness of trying to arrange causes of amnesia along a continuum from psychogenic 
to organic, as Rapaport does (8). Traumatic R.A. is most often initiated by a blow 
on the head with possible injury to brain tissue, but this does not prevent the 
resulting memory disturbance from following psychological variables. Both the 
trauma and the interrupted motivational behavior segment combine to determine 
the result. 

Aside from the generality of its factual basis, certain difficulties attend the 
conception of a motivational behavior segment. To the investigator, human 
behavior appears as continuously motivated, one goal merging into, or forming 
a part of, a more remote goal. This means that it is difficult to determine when 
one segment has ended and another begun; new purposes may be conceived or 
planned some time before they are actually undertaken. In the case of Bill, for 
example, the goal of going roller skating had been planned before the picnic 
ended. However, the R.A. in this case, as in so many others, begins at a point 
close to the beginning of physical movement towards the new goal. This provides 
a fairly objective criterion for distinguishing one segment from another. 


Furthermore, several goals may be pursued concurrently. The picnic party 
was really on its way home, but before this it intended to go roller skating. In 
addition, Bill had the immediate goal of overtaking the other automobiles, and 
perhaps this was the most important factor in the amnesia. However, since all 
these new goals began operating at approximately the same time there is little 
use in trying to separate them. In general, we assume that if a set of goal direc- 
tions operate harmoniously together, they may be considered empirically as 
comprising parts of one motivational behavior segment. 


Since traumatic amnesia is an injury to the person, to test it directly is 
obviously impossible. It might be done indirectly through independent predictions 
on carefully compiled case histories, judged according to agreed-upon criteria. 
The hypothesis might also be tested on shock therapy patients in mental hos- 
pitals. It must be counted a merit of this type of hypothesis that it is stated in 
behavioral terms capable of being checked, rather than in terms of “settling” 
or “consolidating” processes of the brain, which no one knows anything about. 
The latter has been the traditional explanation of R.A. 
PERSPECTIVES IN PSYCHOLOGY 


V. PSYCHOLOGY AND THE HISTORICAL SENSE¹ 


PAUL SWARTZ 


University of Wichita 


The past few decades have witnessed a number of very significant develop- 
ments in the science of human behavior. Psychology, which has not yet shaken 
off the yoke of mentalism and which continues to construct a world picture 
based on the Lockean dichotomy of primary and secondary qualities, seems 
powerfully inclined in the direction of a new scholasticism, one that is based 
not on the authority of a medieval divine but on the “ideal of narrow exactitude,” 
to use Karl Mannheim’s phrase (5): the desire to reduce all behavior to a 
“measurable or inventory-like describability.” In pursuit of this ideal the primary 
concern of the psychologist is with method, and the stress is on quantitative, to 
the neglect of qualitative, analyses of behavior. With great perceptiveness, 
Mannheim has described the new psychology as a move towards “mechanistic 
dehumanization and formalization,” i.e., a way of thinking in which all that is 
“only meaningfully intelligible” is excluded. It is behaviorism, he writes, which 

has pushed to the foreground this tendency towards con- 
centration on entirely externally perceivable reactions, and 
has sought to construct a world of facts in which there will 
exist only measurable data, only correlations between series 
of factors in which the degree of probability of modes of 
behavior in certain situations will be predictable (5, p. 43). 


Mannheim’s observations were made in the 1930’s and were published in 
the English language edition of Ideology and Utopia. As a commentary on 
behaviorism and the type of research it generates, his remarks are just as 
apposite today as they were when first written. Lamentably, the evolution of 
psychology as a science has not yet given the discipline a broad enough base 
in general social science to make psychologists feel the need for taking a 
critical historical perspective in their work or caused them to recognize the 
limitations placed upon the study of behavior by failure to develop a sociology 
of psychological knowledge. Modern psychologists, in short, are notoriously 
insensitive to the methodological and theoretical failings of their predecessors 
and show a disturbing tendency to perpetuate their errors. 


Consider, for example, the current interest in developing mathematical 
models of human behavior. Leaving aside such problems as the relative impor- 
tance of qualitative vs. quantitative analyses of behavior and the tenability of 
assumptions made by the model maker about behavior, what is the significance 
of this effort as viewed from an historical perspective? One psychologist who 
has asked this question is Dr. Paul Lazarsfeld of Columbia University. His observa- 
tions appear as the concluding remarks of the 1954 Dunlap Symposium (4). 
¹ In psychology’s contemporary state of flux a number of discrepant viewpoints co-exist. The present 
viewpoint is offered in competition with more conventional or prevalent conceptions. The editor invites 
(in fact, urges) readers to submit expressions of different viewpoints. 
Lazarsfeld begins his brief statement with the cautionary note “that there is 
a large body of work going on in today’s social sciences which neither needs nor 
cares for mathematical models because it is in a field which is in no way yet 
ready for formalizations of the kind which mathematical models can introduce” 
(p. 97). Social sciences at present, he argues, can benefit only at selected points 
from mathematical models. 


The specific questions with which Lazarsfeld deals in this article are: 1) what 
are the scientific tasks which mathematical models can perform in the broad 
area of social science, and 2) how is the choice of the specific area in which we 
develop models made? In response to the first question, Lazarsfeld recognizes 
a predictive and a linguistic function of models. The latter is divided into the 
organizing function, the analytical function, and the mediating function. “What 
I have called the linguistic function of a mathematical model,” he writes, “helps 
to organize an abundance of material, helps to pin down many a deficiency of 
data, and helps to mediate between procedures that are formally alike but so 
terminologically different between one discipline and another that you never 
really get mutual help” (p. 101). 


It is Lazarsfeld’s response to the second of the two questions that is of most 
immediate concern to us. How does one choose the area in which to develop a 
model? Is the criterion one of significance of behavior or is it one of ease of 
model construction? In a summary of the symposium proceedings the author 
leaves no doubt as to his answer to this question. 


“In the last two days,” Lazarsfeld comments, “we have mainly talked about 
gambling. It bothers me a little,” he continues, “because the decision problems 
I and most of my colleagues deal with are different: why do people commit 
crimes, why do they buy cars and why do they vote for Eisenhower? This is 
where social scientists continuously study decision problems. But, when it comes 
to model construction, the only problem that mathematicians seem attracted to 
is why people bet” (p. 102). 

We are in the act, suggests Lazarsfeld, in our model work of repeating “the 
danger of the behavioristic heretic of 1920 and 1925.” In accordance with the 
criterion of simplicity of situation, the Watsonian psychologist elected to study 
the learning process through observing the behavior, not of the human, but of 
the rat. Model makers accept the same criterion. “The rat maze and the betting 
experiment,” the author writes, “are characterized by the same intellectual turn 
to the simplest configuration. I think that no one,” he adds, “knows what the 
right turn is in terms of the development of the science, whether you had better 
stick to the more complicated situations or whether you should turn to the simple 
ones” (p. 102). 

There is a third problem and, of course, a much more basic one, which 
Lazarsfeld treats of briefly. This is the question of whether to study behavior by 
“Experiment” or by “Observation.” He writes (p. 102): 


I have also never been convinced in accounts of the history 
of psychology that this strong emphasis on experimentation 
is justified. I would like, briefly, to say what I mean. You can 
experiment with how people bet. But if you want to know how 
people vote, and how people buy, I don’t think you can 
really experiment. You have to do systematic analytical 
observations . . . 


Few modern psychologists, to judge by current experimental and theoretical 
developments, share Lazarsfeld’s orientation. To the present writer, however, 
his position is a very compelling one. The recent history of psychology does 
reveal a type of chronic intellectual myopia to be an occupational hazard of 
our profession. Regard, for example, the following observation by Mannheim 
(5, pp. 51-52), made over two decades ago. Is its validity any more widely 
realized and acted upon today than it was in the 1930's? 


. . . For it is not to be denied that the carrying over of the 
methods of natural science to the social sciences gradually 
leads to a situation where one no longer asks what one 
would like to know and what will be of decisive significance 
for the next step in social development, but attempts only 
to deal with those complexes of facts which are measurable 
according to a certain already existent method. Instead 
of attempting to discover what is most significant with the 
highest degree of precision possible under the existing cir- 
cumstances, one tends to be content to attribute importance 
to what is measurable merely because it happens to be 
measurable. 


If there is wisdom in this observation, should we not heed it? How often have 
psychologists “borrowed” the methods and theories of other sciences, only to 
discover, like Cinderella’s sisters, that the slipper does not fit the foot! To cite 
just one recent example, we refer the reader to Frick’s expression of indecision 
regarding the benefit to psychology of attempting to apply the methods of 
engineering communication to problems of human behavior (2). As Swartz and 
Pronko (7) suggest, apropos of this article, the theme of borrowed theories and 
borrowed methods is a dominant motif in the evolution of our science. 


The psychologist’s failure to develop an historical sense is complemented 
by his lack of interest in the history of psychology. Who does research in the 
evolution of psychology today? Very few indeed! Is this because, as one 
psychologist-friend declared, the definitive work in the history of psychology 
has already been written? In the absence of a mature sociology of psychological 
knowledge this belief seems hardly justified. No one who reads extensively in the 
history of psychology can fail to recognize the many gaps in our knowledge 
of the relationship between general cultural factors and the evolution of theories 
of behavior. What do historians of psychology have to tell us, for example, of 
the origin and early development of psycho-physical dualism? Is the first clearly 
defined separation of mind and body to be ascribed to Plato, as some writers 
suggest? Does it antedate him? Or is it not until Descartes, as Dampier (1) 
maintains, that complete dualism was first formulated? For Kantor (3, p. 256), 
“. . . the intangible and invisible soul, believed in especially because it could 
not be seen,” and which “became the occult source of all psychological powers 
and gave rise to the momentous tradition of mind-body which to this day domi- 
nates psychological and general scientific thinking,” was “unknown in Hellenic 
times.” It was the Neoplatonists, he writes, who “set the stage for the spiritistic 
view of man.” 


It is in the insights and observations of the historian, the political scientist, 
the economist, the sociologist and other specialized students of society that we 
will ultimately find the necessary knowledge for defining the cultural matrix 
which gave rise to dualism, nurtured it, and caused it, finally, to be institu- 
tionalized. With this knowledge the psychologist’s appreciation of the traps and 
pitfalls threatening the scientific enterprise will be enhanced, and his ability to 
avoid them improved. Similarly, any added knowledge of the cultural events 
surrounding the historical development of psychology can facilitate a more 
objective and fruitful approach to the problems of human behavior. 


No one contemplating research in the evolution of psychology need fear a 
dearth of topics for investigation. Problem areas abound in great number. This 
is not to suggest, of course, that nothing is known of the history of our science. 
In fact, as even a casual perusal of modern textbooks will reveal, we actually 
know a good deal about the subject. But the story is a very incomplete one, 
particularly in its sociological aspects and in evaluative interpretations. The 
danger is that it will remain incomplete, not because the problems are insoluble 
but because of a lack of interest in them. 


It is easy to be a critic, especially in a young, immature field of study like 
psychology — much easier, as Disraeli observed, than to be correct. Let us end 
this brief commentary, then, with a positive suggestion. We recommend the 
encouragement and acceptance of graduate dissertations in the history of 
psychology. To insist upon an experimental-type dissertation at all levels of 
graduate study is simply another sign of psychology’s chronic intellectual myopia. 
It is a manifestation of our lack of an historical sense. What a giant step forward 
it will be when we are sufficiently mature to appreciate the value and need of 
encouraging scholarly research in the history of psychology! 
A TEST OF SPENCE’S THEORY OF INCENTIVE 
MOTIVATION¹ 


CAROLYN F. SWIFT AND EDWARD L. WIKE 


University of Kansas 


In the published version of the Silliman lectures Spence (4) has proposed 
an explicit theory of instrumental conditioning. According to this viewpoint, 
learning (H) is a function of (N) the number of training trials (4, p. 93) and, all 
other things being equal, excitatory potential (E) is believed to be a multiplica- 
tive function of learning (H) and incentive motivation (K). Spence’s position 
departs from Hull’s revised theory (1) in two important respects: first, reinforce- 
ment is not a necessary condition for learning for Spence; and, second, the role 
of the classically conditioned, fractional anticipatory goal response (rg) and its 
interoceptive stimulus (sg) underlying incentive motivation (K) has been 
developed in detail in Spence’s formulation. 
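
The two relations just stated can be written compactly. This is only a restatement of the verbal summary above, with all other factors held constant as the text specifies; the form of the function f is not given here and is left open.

```latex
H = f(N), \qquad E = H \times K
```

Because the relation is multiplicative, a given difference in K should produce a smaller difference in E when H is low than when H is high.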


According to Spence’s theory of incentive motivation, the consummatory 
response (Rg) becomes classically conditioned to the apparatus cues (Sa), with 
the reward, e.g., food, serving as the unconditioned stimulus. Through generaliza- 
tion, cues (Sql) earlier in the sequence serve to elicit noncompeting portions of 
Rg and the attendant stimulation, i.e., rg-sg. The occurrence of rg is hypothesized 
to increase the motivational level of the organism either through conflict, and 
resulting tension, or by means of the intensity of sg in a manner like Hull’s stimulus 
intensity dynamism (4, p. 135). Spence has further suggested that the vigor of 
occurrence of rg and, accordingly, the magnitude of K, will be a function of the 
number of classical conditioning trials, the amount and qualitative properties of 
the incentive, and the similarity of goal box cues to cues earlier in a runway. 


Spence’s theory has been tested by Stein (5), who contrasted the test trial 
performance on a runway of a control group (C) of rats, which received rewarded 
runway trials, with the performance of three experimental groups (E) which were 
fed in the goal box after being given a series of nonrewarded trials. One E 
group was given 10 food pellets per feeding, another had 2 pellets per feeding, 
and a third group was fed 2 pellets per feeding in a goal box which was dis- 
similar to the runway. On the ensuing test trials, the C group ran more rapidly 
than the E groups, and the latter groups did not differ among themselves in their 
running speeds. 


One possible interpretation of these negative results, that rg occurred in 
the E groups “. . . in such a degree that it competed with the instrumental 
locomotor act on the test trial” (5, p. 273), was evaluated in a further 
experiment. The crucial E group in this study received its goal box feedings 15 min. 
after the nonrewarded runway trials, so as to facilitate the occurrence of rg on 
trials before the initial test trial. While this crucial E group did run faster on the 
initial test trials than a group that had direct goal box feedings prior to test 
trials, it did not outperform a group which was nonrewarded throughout. In 
summary, Stein’s results were not in accord with those aspects of the incentive 
motivation theory which were being tested. 


¹ A portion of an M.A. thesis which was directed by the junior author. 
The purpose of the present experiment was to test: (1) the hypothesis that 
K is a function of the number of classical conditioning trials, and (2) the hypothesis 
that K is lower when the classical conditioning trials occur in a goal box which 
is dissimilar to the rest of the runway apparatus. The essential procedural differ- 
ence between this experiment and Stein’s was that in the present study the goal 
box feedings took place after exploration of the runway only, and H, therefore, 
was lower than in the first Stein experiment, where Ss had had 9 runway trials 
prior to such feedings. 


METHOD 


Subjects 

The Ss were 48 hooded male rats. Twenty-three were from the University 
of Kansas colony; the other 25 were secured from a local vendor. The Ss were 
naive, ranged in age from 150 to 180 days, and were assigned by using a table 
of random numbers to the four experimental conditions described below. One 
S died during the test period. 


Apparatus 

The runway apparatus consisted of a straight alley, whose dimensions were 
58 in. x 5.75 in. x 6.5 in. Attached to one end was a 12 in. starting box, which 
was separated from the runway by a Plexiglas guillotine door. The goal box, 
which was 17 in. x 11.25 in. x 6.5 in., was attached to the other end of the alley 
and was separated from the alley by a Plexiglas guillotine door. These doors 
were operated manually. The entire apparatus was covered with 0.25 in.-thick 
Plexiglas and the interior portion of the apparatus was painted black. The dis- 
similar goal box was 12 in. x 13.5 in. x 6.5 in., was painted white, and had a 
hardware cloth cover. 

Two response measures were taken on the test trials: latency and total 
running time. Latency consisted of the time it took S to leave the starting box 
and locomote to a point 18 in. down the runway. Total time was the time between 
the elevation of starting box door and lowering of goal box door. The times 
were obtained from two Standard timers, and a photo-electric cell arrangement 
stopped the latency timer when S’s passage interrupted the light beam 18 in. 
from the starting door. 


Procedure 

Pre-training. — Days 1-8. The Ss were tamed by daily handling and placed 
on a 23-hr. food deprivation cycle. This level of deprivation was maintained 
throughout the experiment. On Day 8 Ss explored the empty runway in groups 
of four for a 10-min. period. One-half hour after each day’s procedures Ss were
fed Purina Fox Checkers for 1 hr. in their living cages, where water was always
available. 

Training. — Days 9-25. Group One: Ss were given three individual direct 
feedings daily in the goal box of the apparatus. The goal box contained two 
small Purina Layena pellets, placed in a food dish. Each S was removed from 
the goal box 30 sec. after it began to eat. If S did not eat, it was removed from 
the goal box after 60 sec. One direct feeding was given to each S in rotation,
and the rotation was repeated until each S had received its three feedings.

Group Two: On the first 14 days of this period, Ss were taken to the experimental
room and handled for the same amount of time that the Group One Ss
had been handled. On the last three days of this period, Ss had three direct 
feedings daily in the goal box in the same manner as Group One. 
Groups Three and Four: The procedures duplicated those for Groups One
and Two, respectively, except that the direct feedings took place in the
dissimilar goal box.
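
As a check on the training schedule, the number of direct feedings each group received before testing can be tallied from the days and daily feedings stated above. The following is a minimal sketch only; the function and variable names are illustrative and are not part of the original report.

```python
# Tally direct feedings per group from the training schedule (Days 9-25).
# Names are illustrative; only the schedule values come from the text.

TRAINING_DAYS = range(9, 26)      # Days 9 through 25 inclusive (17 days)
FEEDINGS_PER_DAY = 3              # three direct feedings daily


def total_feedings(feeding_days):
    """Return the number of direct feedings given over the listed days."""
    return len(list(feeding_days)) * FEEDINGS_PER_DAY


all_days = list(TRAINING_DAYS)
last_three_days = all_days[-3:]   # Groups Two and Four were fed only on these days

feedings = {
    "Group One (apparatus goal box)": total_feedings(all_days),
    "Group Two (apparatus goal box)": total_feedings(last_three_days),
    "Group Three (dissimilar goal box)": total_feedings(all_days),
    "Group Four (dissimilar goal box)": total_feedings(last_three_days),
}

for group, n in feedings.items():
    print(f"{group}: {n} direct feedings")
```

On this reading of the schedule, Groups One and Three receive 51 direct feedings and Groups Two and Four receive 9, which agrees with the figures given in the Summary.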


Testing.— Days 26-31. All Ss were given one rewarded runway test trial daily. 
Time allotted in the goal box and amount of reward were the same as described 
above under Training. The Ss failing to leave the starting box in one min. were 
placed directly on the runway, immediately beyond the release door of the 
starting box. If an S loitered on the runway more than two min., it was placed 
directly in the goal box. It should be noted that the black, similar goal box 
was present at the end of the runway for all groups during the test period. 


RESULTS 


Since the distributions of both time measures revealed a moderate degree 
of positive skewness, the data were transformed into square roots (3). Following 
Lindquist’s suggestion (2), the missing data problem created by the death of an S 
in Group Two was met by substituting the mean scores for the remaining Group 
Two Ss. The transformed data were then subjected to a repeated measurements 
analysis of variance for a factorial design, the correlated variable being trials 
and its interactions, and the independent variables being goal boxes and the 
number of conditioning trials.2 The results for the transformed running time scores
were identical for both the first 3 days of test and the total 6-day test period:
only trials was a significant source of variation (F = 40.937; df = 2, 86; P < .01 and
F = 32.755; df = 5, 215; P < .01, respectively). These outcomes showed merely that
learning occurred and provided no statistically significant evidence in favor of the
hypotheses being tested.
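
The data handling described above can be sketched as follows. The raw times are invented for illustration, and two points are assumptions rather than statements from the report: that the mean was substituted after the square-root transformation, and that one S's worth of error degrees of freedom was given up for the estimated scores. The second assumption is included only because it reproduces the reported df of 2, 86 and 5, 215.

```python
import math
from statistics import mean

# Hypothetical raw running times (sec.) for a few Group Two Ss on one trial;
# None marks the S that died. These values are invented for illustration.
raw_times = [16.0, 9.0, 25.0, None, 4.0]

# Square-root transformation to reduce positive skewness.
transformed = [math.sqrt(t) if t is not None else None for t in raw_times]

# Replace the missing score with the mean of the remaining Ss' scores
# (assumed here to be done on the transformed scale).
observed = [t for t in transformed if t is not None]
group_mean = mean(observed)
completed = [t if t is not None else group_mean for t in transformed]


def trials_df(n_trials, n_subjects, n_groups, n_estimated=0):
    """Degrees of freedom for trials and its error term in a mixed design.

    Assumes the error term loses (n_trials - 1) df for each S whose
    scores were estimated rather than observed.
    """
    df_trials = n_trials - 1
    df_error = df_trials * (n_subjects - n_groups - n_estimated)
    return df_trials, df_error


print(completed)
print(trials_df(3, 48, 4, n_estimated=1))   # first 3 test days
print(trials_df(6, 48, 4, n_estimated=1))   # full 6-day test period
```

With 48 Ss in four groups and one S estimated, the function returns the df pairs reported for the 3-day and 6-day analyses.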


The analysis of the transformed latencies again disclosed trials to be
significant: F = 29.075; df = 2, 86; P < .01. In addition, the trials by goal boxes
interaction was found to be significant: F = 4.332; df = 2, 86; P < .05. Since this
interaction, if it was intrinsic, had a definite bearing on the second hypothesis,
it was examined further by a twice-corrected table of means (2, p. 118). Table 1
shows that on the first test trial Ss conditioned in the apparatus goal box ran
slower than those fed in the dissimilar goal box, while on the third test trial the
results were in the opposite and predicted direction. Subsequent Tukey gap tests
revealed both of these differences to be statistically significant (P < .05).
Unfortunately, latencies were not available for testing after the third trial due to a
failure of the latency timing mechanism.


TABLE 1

THE TWICE-CORRECTED MEAN TRANSFORMED LATENCIES (SECONDS) FOR GROUPS ONE AND
TWO (APPARATUS GOAL BOX GROUPS) VS. GROUPS THREE AND FOUR (DISSIMILAR GOAL
BOX GROUPS) FOR THE FIRST THREE TEST TRIALS.

                              TRIAL
GROUP                     1       2       3
APPARATUS GOAL BOX      4.32    4.06    3.61
DISSIMILAR GOAL BOX     3.68    3.93    4.39
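
The reversal of the groups across trials can be read directly from the Table 1 means. The short sketch below computes the trial-by-trial difference between the two goal box conditions. It uses only the tabled values, the variable names are illustrative, and it does not reproduce the Tukey gap tests.

```python
# Mean transformed latencies from Table 1 (first three test trials).
apparatus = [4.32, 4.06, 3.61]    # Groups One and Two
dissimilar = [3.68, 3.93, 4.39]   # Groups Three and Four

# Positive difference: the apparatus goal box groups were slower on that trial.
differences = [round(a - d, 2) for a, d in zip(apparatus, dissimilar)]

for trial, diff in enumerate(differences, start=1):
    slower = "apparatus" if diff > 0 else "dissimilar"
    print(f"Trial {trial}: difference = {diff:+.2f} ({slower} goal box groups slower)")
```

The difference changes sign between the first and third trials, which is the crossover underlying the trials by goal boxes interaction.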


2 We are indebted to Dr. Jack Michael, University of Houston, for statistical assistance.
DISCUSSION


Analysis of the latency data revealed one relevant significant F-ratio, trials 
by goal boxes interaction, which lends partial support to Spence’s theory of 
generalization of r_g. According to this theory, Groups One and Two, which
were fed in the apparatus goal box during the training period, should have 
performed better than Groups Three and Four which were fed in the dissimilar 
goal box. On the initial test run the groups which had conditioning trials in the 
apparatus goal box did not perform as well as those which had been fed in the 
dissimilar goal box. Although Ss with feeding experience in the apparatus had 
significantly longer latencies on the first test trial than those fed in the alternate 
goal box, the former improved their performance at a faster
rate than the latter, and on the third test trial the positions of the groups were
reversed. That is, on Day 3 of the test period Ss with direct feeding experiences 
in the apparatus had significantly shorter latencies than Ss whose direct feeding 
experiences took place in the alternate goal box. With the exception of the 
relative positions of the groups on the initial test trial, the evidence supports the 
theory of generalization of the antedating goal response proposed by Spence. 


The fact that there was a significant difference between the groups for the 
first trial suggests the possibility that either certain variables were operating to 
inflate the performances of the dissimilar goal box Ss, or certain variables were 
operating to depress the performances of the apparatus goal box Ss. Since we 
have no rationale for the former alternative, an examination of an explanation 
consistent with the latter possibility is in order. 


Efforts to facilitate generalization in Groups One and Two by maximizing 
the similarities between the goal box and the rest of the apparatus may have 
been successful to the extent that these animals failed to discriminate the starting 
box from the goal box, with the result that conditions appropriate to extinction 
were present on the first test trial. It is not unreasonable to assume that the longer 
latencies exhibited by these groups on the first test trial were partially due to 
responses such as “looking for food” which were incompatible with the running 
response to be learned. Since the starting box was dissimilar to the alternate 
goal box in which Groups Three and Four were fed, one would expect fewer 
such incompatible responses on the part of these Ss. If this hypothesis is correct, 
then the performance of Groups One and Two on the initial test trial could be 
expected to be depressed relative to the performance of Groups Three and 
Four in which the probability of generalization from the training trials to the test 
trials was minimal. While this explanation is admittedly an ad hoc theory, it could 
be readily tested by introducing discrimination training involving the starting 
box and the goal box during the direct feeding procedure. 


Finally, no evidence was found to support the first hypothesis: namely, that K
would be higher when the number of conditioning trials was larger. It should also
be noted that Stein (5) did not find that Ss given a large amount of reward in
the direct feedings outperformed Ss given a small amount. The possibility exists
that both tests were confounded by the failure of Ss to discriminate the starting
box from the goal box and the consequent interference with the development of
the locomotor habit.


SUMMARY 


Two implications of Spence’s theory of incentive motivation were tested
using four groups of rats and a runway apparatus. Groups One and Two had
51 and 9 direct feeding experiences, respectively, in the runway goal box, while
Groups Three and Four had the same number of direct feedings in a dissimilar
goal box. Following this training designed to condition r_g, all Ss had six rewarded
trials in the runway terminating at the runway goal box. Hypothesis (1), that K
is a function of the number of classical conditioning trials, was not substantiated.
Some support was found for the second hypothesis, that K is greater when
conditioning occurs in a goal box which is similar to the rest of the runway, since Ss
in Groups One and Two had shorter latencies than Ss in Groups Three and Four
on the third day of testing. The inferior performance of the former groups on the
initial test trial was attributed to the failure of Ss to discriminate the starting
box from the goal box and the attendant occurrence of responses which were
incompatible with running. A test of this hypothesis was proposed.